Abstract

We discuss several tests for whether a given set of independent and identically
distributed (i.i.d.) draws does not come from a specified probability density function.
The most commonly used are Kolmogorov-Smirnov tests, particularly Kuiper's variant,
which focus on discrepancies between the cumulative distribution function for the
specified probability density and the empirical cumulative distribution function for the
given set of i.i.d. draws. Unfortunately, variations in the probability density function
often get smoothed over in the cumulative distribution function, making it difficult to
detect discrepancies in regions where the probability density is small in comparison
with its values in surrounding regions. We discuss tests without this deficiency,
complementing the classical methods. The tests of the present paper are based on the plain
fact that it is unlikely to draw a random number whose probability is small, provided
that the draw is taken from the same distribution used in calculating the probability
(thus, if we draw a random number whose probability is small, then we can be confident
that we did not draw the number from the same distribution used in calculating
the probability).

Key words: Kolmogorov-Smirnov, nonparametric, goodness-of-fit, outlier, distribution 
function, nonincreasing rearrangement 



1 Introduction 

A basic task in statistics is to ascertain whether a given set of independent and identically
distributed (i.i.d.) draws $X_1, X_2, \dots, X_{n-1}, X_n$ does not come from a distribution with a
specified probability density function p (the null hypothesis is that $X_1, X_2, \dots, X_{n-1}, X_n$
do in fact come from the specified p). In the present paper, we consider the case when
$X_1, X_2, \dots, X_{n-1}, X_n$ are real valued. In this case, the most commonly used approach is
due to Kolmogorov and Smirnov (with a popular modification by Kuiper); see, for example,
Sections 14.3.3 and 14.3.4 of [9], p], [15], or Section 3 below.

The Kolmogorov-Smirnov approach considers the size of the discrepancy between the
cumulative distribution function for p and the empirical cumulative distribution function
defined by $X_1, X_2, \dots, X_{n-1}, X_n$ (see, for example, Sections 2 and 3 below for definitions
of cumulative distribution functions and empirical cumulative distribution functions). If the
i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ used to form the empirical cumulative distribution function
are taken from the probability density function p used in the Kolmogorov-Smirnov test, then
the discrepancy is small. Thus, if the discrepancy is large, then we can be confident that
$X_1, X_2, \dots, X_{n-1}, X_n$ do not come from a distribution with probability density function p.

However, the size of the discrepancy between the cumulative distribution function for p
and the empirical cumulative distribution function constructed from the i.i.d. draws
$X_1, X_2, \dots, X_{n-1}, X_n$ does not always signal that $X_1, X_2, \dots, X_{n-1}, X_n$ do not arise
from a distribution with the specified probability density function p, even when
$X_1, X_2, \dots, X_{n-1}, X_n$ do not in fact arise from p. In some cases, n has to be absurdly
large for the discrepancy to be significant. It is easy to see why:

The cumulative distribution function is an indefinite integral of the probability density 
function p. Therefore, the cumulative distribution function is a smoothed version of the 
probability density function; focusing on the cumulative distribution function rather than p 
itself makes it harder to detect discrepancies in regions where p is small in comparison with 
its values in surrounding regions. For example, consider the probability density function p 
depicted in Figure 1 below (a "tent" with a narrow triangle removed at its apex) and the
probability density function q depicted in Figure 2 below (nearly the same "tent," but with
the narrow triangle intact, not removed). The cumulative distribution functions for p and 
q are very similar, so tests of the classical Kolmogorov-Smirnov type have trouble signaling 
that i.i.d. draws taken from q are actually not taken from p. Section 14.3.4 of [5] highlights 
this problem and a strategy for its solution, hence motivating us to write the present article. 

We propose to supplement tests of the classical Kolmogorov-Smirnov type with tests for
whether any of the values $p(X_1), p(X_2), \dots, p(X_{n-1}), p(X_n)$ is small. If any of these values
is small, then we can be confident that the i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ did not arise
from the probability density function p. Theorem 3.3 below formalizes the notion of any
of $p(X_1), p(X_2), \dots, p(X_{n-1}), p(X_n)$ being small. We also propose another complementary
test, which amounts to using the Kolmogorov-Smirnov approach after "rearranging" the
probability density function p so that it is nondecreasing on the shortest interval outside
which it vanishes (see Remark 2.1 and formula (4) below).

For descriptions of other generalizations of and alternatives to the Kolmogorov-Smirnov 
approach (concerning issues distinct from those treated in the present paper), see, for exam- 
ple, Sections 14.3.3 and 14.3.4 of [9], PQ, [3], [5], [6], [7], g], [10], [TT], Q3], [IS], [E], and their 
compilations of references. For a more general approach, based on customizing statistical 
tests for problem-specific families of alternative hypotheses, see [2]. Below, we compare the
test statistics of the present article with one of the most commonly used test statistics of 
the Kolmogorov-Smirnov type, namely Kuiper's (see, for example, [16], [15], or Section 3
below). We recommend using the test statistics of the present paper in conjunction with the 
Kuiper statistic, to be conservative, as all these statistics complement each other, helping 
compensate for their inevitable deficiencies. 

There are at least two canonical applications. First, the tests of the present article can be
suitable for checking for malfunctions with and bugs in computer codes that are supposed to
generate pseudorandom i.i.d. draws from specified probability density functions (especially
the complicated ones encountered frequently in practice). Good software engineering requires
such independent tests for helping validate that computer codes produce correct results (of
course, such validations do not obviate careful, structured programming, but are instead
complementary). Second, many theories from physics and physical chemistry predict (often
a priori) the probability density functions from which experiments are supposed to be taking
i.i.d. draws. The tests of the present paper can be suitable for ruling out erroneous theories
of this type, on the basis of experimental data. Moreover, there are undoubtedly many other
potential applications, in addition to these two.

Table 1: Notational conventions

mathematical object                                typeface           example
-------------------------------------------------  -----------------  -------------------------
probability density function                       italic lowercase   $p(x)$
cumulative distribution function defined in (1)    italic uppercase   $P(x)$
distribution function defined in (2)               script uppercase   $\mathcal{P}(x)$
taking the probability of an event                 bold uppercase     $\mathbf{P}\{X \le x\}$

For definitions of the notation used throughout, see Section 2. Section 3 introduces several
statistical tests. Section 4 illustrates the power of the statistical tests via some numerical
examples. Section 5 draws several conclusions and proposes directions for further work.



2 Notation

In this section, we set notation used throughout the present paper.

We use $\mathbf{P}$ to take the probability of an event. We say that p is a probability density
function to mean that p is a (Lebesgue-measurable) function from $\mathbb{R}$ to $[0, \infty)$ such that the
integral of p over $\mathbb{R}$ is 1.

The cumulative distribution function P for a probability density function p is

$$P(x) = \int_{y \le x} p(y) \, dy \tag{1}$$

for any real number x. If X is a random variable distributed according to p, then P(x) is
just the probability that $X \le x$. Therefore, if X is a random variable distributed according
to p, then the cumulative distribution function $\mathcal{P}$ for p(X) is

$$\mathcal{P}(x) = \int_{p(y) \le x} p(y) \, dy, \tag{2}$$

the probability that $p(X) \le x$.

For reference, we summarize our (reasonably standard) notational conventions in Table 1.

Remark 2.1. The "nonincreasing rearrangement" (or nondecreasing rearrangement) of a
probability density function (see, for example, Section V.3 of [H]) clarifies the meaning of
the distribution function $\mathcal{P}$ defined in (2). With P defined in (1) and $\mathcal{P}$ defined in (2),
$\mathcal{P}(p(x)) = P(x)$ for any real number x in the shortest interval outside which the probability
density function p vanishes, as long as p is increasing on that shortest interval.


3 Test statistics



In this section, we introduce several statistical tests. 

One test of whether i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ do not arise from a specified
probability density function p is the Kolmogorov-Smirnov test (or Kuiper's often preferable
variation). If X is a random variable distributed according to p, then another test is to use
the Kolmogorov-Smirnov or Kuiper test for the random variable p(X), whose cumulative
distribution function is $\mathcal{P}$ in (2). The test statistic for the original Kuiper test is



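$$U = \left( \sqrt{n} \sup_{-\infty < x < \infty} \bigl( P(x) - \hat{P}(x) \bigr) \right) - \left( \sqrt{n} \inf_{-\infty < x < \infty} \bigl( P(x) - \hat{P}(x) \bigr) \right), \tag{3}$$

where $\hat{P}(x)$ is the empirical cumulative distribution function, that is, the number of k such
that $X_k \le x$, divided by n. The test statistic for the Kuiper test for p(X) is therefore

$$V = \left( \sqrt{n} \sup_{0 \le x < \infty} \bigl( \mathcal{P}(x) - \hat{\mathcal{P}}(x) \bigr) \right) - \left( \sqrt{n} \inf_{0 \le x < \infty} \bigl( \mathcal{P}(x) - \hat{\mathcal{P}}(x) \bigr) \right), \tag{4}$$

where $\hat{\mathcal{P}}(x)$ is the number of k such that $p(X_k) \le x$, divided by n. Remark 2.1 above and
Remark 3.6 below provide some motivation for using V, beyond its being a natural variation
on U.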
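
As a rough illustration of (3), the following Python sketch evaluates the Kuiper statistic for a
list of draws. It assumes that the cumulative distribution function passed in is continuous, so
that the supremum and infimum are reached at (or just before) the sample points. The function
name `kuiper_statistic` and its arguments are illustrative choices, not notation from the cited
references. Applying the same function to the values $p(X_k)$, with the distribution function
from (2) in place of P, would give V in (4), again only under the assumption that this
distribution function is continuous.

```python
import math

def kuiper_statistic(draws, cdf):
    """Kuiper statistic (3) for draws against a continuous cdf (assumed)."""
    xs = sorted(draws)
    n = len(xs)
    above = 0.0  # running sup of (empirical cdf - cdf), reached at a sample point
    below = 0.0  # running sup of (cdf - empirical cdf), reached just before one
    for k, x in enumerate(xs):
        f = cdf(x)
        above = max(above, (k + 1) / n - f)
        below = max(below, f - k / n)
    # sup(P - P_hat) - inf(P - P_hat) equals sup(P - P_hat) + sup(P_hat - P)
    return math.sqrt(n) * (above + below)

# Hypothetical usage: four draws tested against the uniform distribution on [0, 1].
u = kuiper_statistic([0.1, 0.4, 0.5, 0.9], lambda x: min(1.0, max(0.0, x)))
print(u)
```

Sorting the draws first keeps the cost at O(n log n), which matters for the larger values of n
considered in the numerical examples.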

The rationale for using statistics such as U and V is the following theorem, corollary, 
and the ensuing discussion (see, for example, Sections 14.3.3 and 14.3.4 of [9], [15], or [16] 
for proofs and details). 

Theorem 3.1. Suppose that p is a probability density function, X is a random variable
distributed according to p, and P is the cumulative distribution function for X from (1).
Then, the distribution of P(X) is the uniform distribution over [0, 1].

Corollary 3.2. Suppose that p is a probability density function, X is a random variable
distributed according to p, and $\mathcal{P}$ is the cumulative distribution function for p(X) from (2).
Then, the cumulative distribution function of $\mathcal{P}(p(X))$ is less than or equal to the
cumulative distribution function of the uniform distribution over [0, 1]. Moreover, the
distribution of $\mathcal{P}(p(X))$ is the uniform distribution over [0, 1] if $\mathcal{P}$ is a continuous
function ($\mathcal{P}$ is a continuous function when, for every nonnegative real number y, the
probability that p(X) = y is 0).

Theorem 3.1 generalizes to the fact that, if the i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ arise
from the probability density function p involved in the definition of U in (3), then the
distribution of U does not depend on p; the distribution of U is the same for any p. With
high probability, U is not much greater than 1 when the i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$
used in the definition of U in (3) are taken from the distribution whose probability density
function p and cumulative distribution function P are used in the definition of U. Therefore,
if the statistic U that we compute turns out to be substantially greater than 1, then we can
have high confidence that the i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ were not taken from the
distribution whose probability density function p and cumulative distribution function P
were used in the definition of U. Similarly, if V defined in (4) turns out to be substantially
greater than 1, then we can have high confidence that the i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$
were not taken from the distribution whose probability density function p and distribution
function $\mathcal{P}$ were used in the definition of V. For details, see, for example, Sections 14.3.3
and 14.3.4 of [9], p3], or [16].

A third test statistic is

$$W = n \min_{1 \le k \le n} \mathcal{P}(p(X_k)). \tag{5}$$

The following theorem and ensuing discussion characterize W and its applications. 

Theorem 3.3. Suppose that p is a probability density function, n is a positive integer,
$X_1, X_2, \dots, X_{n-1}, X_n$ are i.i.d. random variables each distributed according to p, $\mathcal{P}$ is the
cumulative distribution function for $p(X_1)$ from (2), and W is the random variable defined
in (5). Then,

$$\mathbf{P}\{W \le x\} \le 1 - \left( 1 - \frac{x}{n} \right)^n \tag{6}$$

for any $x \in [0, n]$.

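Proof. It follows from (5) that

$$\mathbf{P}\{W > nx\} = \mathbf{P}\{\mathcal{P}(p(X_1)) > x \text{ and } \mathcal{P}(p(X_2)) > x \text{ and } \dots \text{ and } \mathcal{P}(p(X_n)) > x\} \tag{7}$$

for any $x \in [0, 1]$. It follows from the independence of $X_1, X_2, \dots, X_{n-1}, X_n$ that

$$\mathbf{P}\{\mathcal{P}(p(X_1)) > x \text{ and } \mathcal{P}(p(X_2)) > x \text{ and } \dots \text{ and } \mathcal{P}(p(X_n)) > x\} = \prod_{k=1}^{n} \mathbf{P}\{\mathcal{P}(p(X_k)) > x\} \tag{8}$$

for any $x \in [0, 1]$. It follows from Corollary 3.2 that

$$\mathbf{P}\{\mathcal{P}(p(X_k)) > x\} \ge 1 - x \tag{9}$$

for any $x \in [0, 1]$ and k = 1, 2, ..., n − 1, n. Combining (7), (8), and (9), and then
replacing x with x/n, yields (6). □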

For any positive real number $\alpha < 1/2$, we define

$$x_\alpha = n - n (1 - \alpha)^{1/n}; \tag{10}$$

if $W \le x_\alpha$, then due to (6) we can have at least $[100(1-\alpha)]\%$ confidence that the i.i.d.
draws $X_1, X_2, \dots, X_{n-1}, X_n$ do not arise from p. It follows from (10) that

$$\alpha \le x_\alpha < -\ln(1 - \alpha) = \alpha + \alpha^2/2 + \alpha^3/3 + \alpha^4/4 + \dots < \alpha + \alpha^2, \tag{11}$$

with $x_\alpha = \alpha$ for n = 1, and $\lim_{n \to \infty} x_\alpha = -\ln(1 - \alpha)$. Therefore, if $W \le \alpha$, then we have
at least $[100(1-\alpha)]\%$ confidence that the i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ do not arise
from p. Taking $\alpha = .01$, for example, we have at least 99% confidence that the i.i.d. draws
$X_1, X_2, \dots, X_{n-1}, X_n$ do not arise from p, if $W \le .01$.

In short, for any positive real number $\alpha < 1/2$, if the statistic W defined in (5) is at most
$\alpha$, then we have at least $[100(1-\alpha)]\%$ confidence that the i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$
do not arise from the probability density function p used in (5). If however W is greater than
$\alpha + \alpha^2$, then (6) provides no basis for claiming with at least $[100(1-\alpha)]\%$ confidence that
the i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ do not arise from the probability density function p
used in (5).
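
The following Python sketch shows how W in (5) and the threshold in (10) might be computed. It
is only an illustration: the function names are hypothetical, and the example density
(p(x) = 2x on [0, 1], for which the distribution function in (2) works out to $(y/2)^2$ on
[0, 2]) is an assumption chosen for simplicity rather than one of the paper's examples. The
threshold is evaluated with `expm1` and `log1p`, which is algebraically the same as (10) but
avoids cancellation for large n.

```python
import math

def w_statistic(draws, p, script_p):
    """W = n * min over k of script_p(p(X_k)), as in (5)."""
    n = len(draws)
    return n * min(script_p(p(x)) for x in draws)

def x_alpha(alpha, n):
    """Threshold n - n * (1 - alpha)**(1/n) from (10), computed stably."""
    return -n * math.expm1(math.log1p(-alpha) / n)

# Assumed toy density p(x) = 2x on [0, 1]; then P{p(X) <= y} = (y/2)**2 on [0, 2].
p = lambda x: 2.0 * x if 0.0 <= x <= 1.0 else 0.0
script_p = lambda y: min(1.0, (y / 2.0) ** 2)

draws = [0.5, 0.1, 0.9]
w = w_statistic(draws, p, script_p)
# Compare W with the threshold for alpha = 0.05 (95% confidence).
print(w, x_alpha(0.05, len(draws)), w <= x_alpha(0.05, len(draws)))
```

Comparing W with $\alpha$ itself instead of $x_\alpha$ is slightly more conservative, in line
with (11).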

Remark 3.4. If W defined in (5) is at most 1, then we can have at least $[100(1-W)]\%$
confidence that the i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ do not arise from the probability
density function p used in (5).

Remark 3.5. Using W defined in (5) along with the upper bound (6) is optimal when the
probability density function p takes on only finitely many values, or when p has the property
that, for every nonnegative real number y, the probability is 0 that p(X) = y, where X is
a random variable distributed according to p. In both cases, the inequality (6) becomes the
equality

$$\mathbf{P}\{W \le n \, \mathcal{P}(p(x))\} = 1 - \bigl( 1 - \mathcal{P}(p(x)) \bigr)^n \tag{12}$$

for any $x \in \mathbb{R}$.

Remark 3.6. When the statistic W defined in (5) is not powerful enough to discriminate
between two particular distributions, then a natural alternative is the average

$$\tilde{W} = \frac{1}{n} \sum_{1 \le k \le n} \mathcal{P}(p(X_k)). \tag{13}$$

The Kuiper test statistic V defined in (4) is a more refined version of this alternative, and
we recommend using V instead of $\tilde{W}$, in conjunction with the use of W defined in (5) and U
defined in (3). We could also consider more general averages of the form

$$f\left( \frac{1}{n} \sum_{1 \le k \le n} g\bigl( \mathcal{P}(p(X_k)) \bigr) \right), \tag{14}$$

where f and g are functions; obvious candidates include $f(x) = \exp(x)$ and $g(x) = \ln(x)$,
and $f(x) = 1 - x^{1/q}$ and $g(x) = (1 - x)^q$, with $q \in (1, \infty)$.
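
The averages in (13) and (14) can be sketched in a few lines of Python. The helper names below
are hypothetical, and the input list stands for the already-computed values of the distribution
function from (2) evaluated at $p(X_1), \dots, p(X_n)$.

```python
import math

def generalized_average(values, f, g):
    """f((1/n) * sum of g(v_k)), as in (14); v_k stands for script-P(p(X_k))."""
    n = len(values)
    return f(sum(g(v) for v in values) / n)

def power_pair(q):
    """The candidate pair f(x) = 1 - x**(1/q), g(x) = (1 - x)**q for q > 1."""
    return (lambda x: 1.0 - x ** (1.0 / q)), (lambda x: (1.0 - x) ** q)

values = [0.25, 1.0]  # illustrative inputs, not data from the paper
identity = lambda x: x

plain = generalized_average(values, identity, identity)      # the average (13)
# f = exp, g = ln gives a geometric mean; math.log raises ValueError if a value is 0.
geometric = generalized_average(values, math.exp, math.log)
f2, g2 = power_pair(2.0)
powered = generalized_average(values, f2, g2)
print(plain, geometric, powered)
```

With f = exp and g = ln the result is the geometric mean of the values, which is undefined as
soon as one value is 0; that case would need separate handling.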

Remark 3.7. To clarify further, let us consider the case n = 1. If we are given a probability
density function p and a draw X (not necessarily from p) such that $\mathcal{P}(p(X))$ is small,
where $\mathcal{P}$ is defined in (2) for p, then why can we be confident that X was not drawn from
a distribution with probability density function p? Well, if $\mathcal{P}(p(X))$ is small, then the
likelihood of drawing X from a distribution with probability density function p is small, in
the sense that we would be at least as confident that any draw Y satisfying $p(Y) \le p(X)$
does not arise from a distribution with probability density function p, and the probability
under p of all such draws is just $\mathcal{P}(p(X))$, which is small (by assumption).

4 Numerical examples 

In this section, we illustrate the effectiveness of the test statistics of the present paper via
several numerical experiments. For each experiment, we compute the statistics U, V, and
W defined in (3), (4), and (5) for two sets of i.i.d. draws, first for i.i.d. draws
$X_1, X_2, \dots, X_{n-1}, X_n$ taken from the distribution whose probability density function p,
cumulative distribution function P, and distribution function $\mathcal{P}$ are used in the definitions
of U, V, and W in (3), (4), and (5), and second for i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ taken
from a different distribution.

The test statistics U and V defined in (3) and (4) are the same, except that U concerns a
random variable X drawn from a probability density function p, while V concerns p(X). We
can directly compare the values of U and V for various distributions in order to gauge their
relative discriminative powers. Ideally, U and V should not be much greater than 1 when the
i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ used in the definitions of U and V in (3) and (4) are taken
from the distribution whose probability density function p, cumulative distribution function
P, and distribution function $\mathcal{P}$ are used in the definitions of U and V; U and V should be
substantially greater than 1 when the i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ are taken from a
different distribution, to signal the difference between the common distribution of each of
$X_1, X_2, \dots, X_{n-1}, X_n$ and the distribution whose probability density function p, cumulative
distribution function P, and distribution function $\mathcal{P}$ are used in the definitions of U and V.

For details concerning the interpretation of and significance levels for the Kuiper test
statistics U and V defined in (3) and (4), see Sections 14.3.3 and 14.3.4 of [9], [16], or [15];
both one- and two-tailed hypothesis tests are available, for any finite number n of draws
$X_1, X_2, \dots, X_{n-1}, X_n$, and also in the limit of large n. In short, if $X_1, X_2, \dots, X_{n-1}, X_n$
are i.i.d. random variables drawn according to a continuous cumulative distribution function
P, then the complementary cumulative distribution function of U defined in (3) for the
same cumulative distribution function P has an upper tail that decays nearly as fast as the
complementary error function. Although the details are complicated (varying with n and
with the form, one-tailed or two-tailed, of the hypothesis test), the probability that U
is greater than 2 is at most 1% when $X_1, X_2, \dots, X_{n-1}, X_n$ used in (3) are drawn according
to the same cumulative distribution function P as used in (3).

As described in Remark 3.4, the interpretation of the test statistic W defined in (5) is
simple: If W defined in (5) is at most 1, then we can have at least $[100(1-W)]\%$ confidence
that the i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ do not arise from the probability density function
p used in (5).

Tables 2–6 display numerical results for the examples described in the subsections below.
The following list describes the headings of the tables:

• n is the number of i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ taken to form the statistics U,
V, and W defined in (3), (4), and (5).

• $U_0$ is the statistic U defined in (3), with the $X_1, X_2, \dots, X_{n-1}, X_n$ defining $\hat{P}$ in (3)
drawn from a distribution with the same cumulative distribution function P as used
in (3). Ideally, $U_0$ should be small, not much larger than 1.

• $U_1$ is the statistic U defined in (3), with the $X_1, X_2, \dots, X_{n-1}, X_n$ defining $\hat{P}$ in (3)
drawn from a distribution with a cumulative distribution function that is different from
P used in (3). Ideally, $U_1$ should be large, substantially greater than 1, to signal the
difference between the common distribution of each of $X_1, X_2, \dots, X_{n-1}, X_n$ and the
distribution with the cumulative distribution function P used in (3). The numbers
in parentheses in the tables indicate the order of magnitude of the significance level
for rejecting the null hypothesis, that is, for asserting that the draws
$X_1, X_2, \dots, X_{n-1}, X_n$ do not arise from P.

• $V_0$ is the statistic V defined in (4), with the $X_1, X_2, \dots, X_{n-1}, X_n$ defining $\hat{\mathcal{P}}$ in (4)
drawn from a distribution with the same probability density function p used for $\mathcal{P}$ and
for $\hat{\mathcal{P}}$ in (4). Ideally, $V_0$ should be small, not much larger than 1.

• $V_1$ is the statistic V defined in (4), with the $X_1, X_2, \dots, X_{n-1}, X_n$ defining $\hat{\mathcal{P}}$ in (4)
drawn from a distribution that is different from the distribution with the probability
density function p used for $\mathcal{P}$ and for $\hat{\mathcal{P}}$ in (4). Ideally, $V_1$ should be large, substantially
greater than 1, to signal the difference between the common distribution of each of
$X_1, X_2, \dots, X_{n-1}, X_n$ and the distribution with the probability density function p
used for $\mathcal{P}$ and for $\hat{\mathcal{P}}$ in (4). The numbers in parentheses in the tables indicate the
order of magnitude of the significance level for rejecting the null hypothesis, that is,
for asserting that the draws $X_1, X_2, \dots, X_{n-1}, X_n$ do not arise from p. We used [16]
to estimate the significance level; this estimate can be conservative for V.

• $W_0$ is the statistic W defined in (5), with the $X_1, X_2, \dots, X_{n-1}, X_n$ in (5) drawn from
a distribution with the same probability density function p and distribution function
$\mathcal{P}$ in (5). Ideally, $W_0$ should not be much less than 1.

• $W_1$ is the statistic W defined in (5), with the $X_1, X_2, \dots, X_{n-1}, X_n$ in (5) drawn
from a distribution that is different from the distribution with the probability density
function p used in (5) (p is used both directly and for defining the distribution function
$\mathcal{P}$ in (5)). Ideally, $W_1$ should be small, substantially less than 1, to signal the difference
between the common distribution of each of $X_1, X_2, \dots, X_{n-1}, X_n$ and the distribution
with the probability density function p used in (5). $W_1$ itself is the significance level
for rejecting the null hypothesis, i.e., for asserting that the draws do not arise from p.

4.1 A sawtooth wave 

The probability density function p for our first example is 

$$p(x) = \begin{cases} 2 \cdot 10^{-3} \cdot (x - k), & x \in (k, k+1) \text{ for some } k \in \{0, 1, \dots, 998, 999\} \\ 0, & \text{otherwise} \end{cases} \tag{15}$$

for any $x \in \mathbb{R}$. The corresponding cumulative distribution function P defined in (1) is

$$P(x) = \begin{cases} 10^{-3} \cdot (x - k)^2 + 10^{-3} \cdot k, & x \in [k, k+1] \text{ for some } k \in \{0, 1, \dots, 998, 999\} \\ 0, & x \le 0 \\ 1, & x \ge 1000 \end{cases} \tag{16}$$

for any $x \in \mathbb{R}$. The distribution function $\mathcal{P}$ defined in (2) is

$$\mathcal{P}(x) = \begin{cases} 10^{6} x^2 / 4, & x \in [0, 2 \cdot 10^{-3}] \\ 1, & x > 2 \cdot 10^{-3} \end{cases} \tag{17}$$

for any nonnegative real number x.

We compute the statistics U, V, and W defined in (3), (4), and (5) for two sets of i.i.d.
draws, first for i.i.d. draws distributed according to p defined in (15), and then for i.i.d. draws
from the uniform distribution on (0, 1000). Table 2 displays numerical results.
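
To make the setup concrete, here is a minimal Python sketch of the sawtooth example built from
(15), (16), and (17). It is not the code used to produce Table 2: the inverse-transform sampler,
the use of Python's standard `random` module, the seed, and all function names are assumptions
made for illustration, so the printed values will differ from the tabulated ones.

```python
import math
import random

def p(x):
    """Sawtooth density (15)."""
    return 2e-3 * (x - math.floor(x)) if 0.0 < x < 1000.0 else 0.0

def cdf(x):
    """Cumulative distribution function (16)."""
    if x <= 0.0:
        return 0.0
    if x >= 1000.0:
        return 1.0
    k = math.floor(x)
    return 1e-3 * ((x - k) ** 2 + k)

def script_p(y):
    """Distribution function (17) of p(X), for nonnegative y."""
    return min(1.0, 2.5e5 * y * y)

def draw_from_p(rng):
    """Inverse-transform draw: a uniformly chosen tooth k plus offset sqrt(u)."""
    return rng.randrange(1000) + math.sqrt(rng.random())

def w_statistic(draws):
    """W from (5) for this particular p."""
    return len(draws) * min(script_p(p(x)) for x in draws)

rng = random.Random(0)  # arbitrary seed, used only for reproducibility
n = 1000
w0 = w_statistic([draw_from_p(rng) for _ in range(n)])          # draws from p
w1 = w_statistic([rng.uniform(0.0, 1000.0) for _ in range(n)])  # uniform draws
print(w0, w1)
```

The sampler relies on each tooth carrying probability $10^{-3}$ and on the offset within a
tooth having cumulative distribution function $t^2$ on (0, 1), which follows from (16).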

Table 2: A sawtooth wave

n      U_0     U_1     V_0     V_1                 W_0     W_1
-----  ------  ------  ------  ------------------  ------  -------
10^1   .13E1   .12E1   .11E1   .14E1               .24E1   .49E-2
10^2   .12E1   .18E1   .10E1   .21E1               .37E0   .45E-1
10^3   .82E0   .79E0   .13E1   .81E1 (10^-54)      .18E1   .10E-2
10^4   .12E1   .17E1   .13E1   .25E2 (10^-7E2)     .30E1   .72E-4
10^5   .10E1   .12E1   .18E1   .79E2 (10^-7E3)     .18E0   .34E-4
10^6   .81E0   .14E1   .12E1   .25E3 (10^-7E4)     .11E1   .HEM
10^7   .15E1   .19E1   .18E1   .79E3 (10^-7E5)     .13E1   .38E-8



For this example, the classical Kuiper statistic U is unable to signal that the draws from
the uniform distribution do not arise from p defined in (15) for $n \le 10^7$, at least not nearly
as well as the modified Kuiper statistic V, which signals the discrepancy with very high
confidence for $n \ge 10^3$. The statistic W signals the discrepancy with high confidence for
$n \ge 10^3$, too.

4.2 A step function 

The probability density function p for our second example is a step function (a function 
which is constant on each interval in a particular partition of the real line into finitely many 
intervals). In particular, we define 

$$p(x) = \begin{cases} 10^{-3}, & x \in (2k-1, 2k) \text{ for some } k \in \{1, 2, \dots, 998, 999\} \\ 10^{-6}, & x \in (2k, 2k+1) \text{ for some } k \in \{0, 1, 2, \dots, 998, 999\} \\ 0, & \text{otherwise} \end{cases} \tag{18}$$

for any $x \in \mathbb{R}$. The corresponding cumulative distribution function P defined in (1) is

$$P(x) = \begin{cases} 10^{-6} \cdot k + 10^{-3} \cdot (x - k), & x \in [2k-1, 2k] \text{ for some } k \in \{1, 2, \dots, 998, 999\} \\ 10^{-6} \cdot (x - k) + 10^{-3} \cdot k, & x \in [2k, 2k+1] \text{ for some } k \in \{0, 1, 2, \dots, 998, 999\} \\ 0, & x \le 0 \\ 1, & x \ge 1999 \end{cases} \tag{19}$$

for any $x \in \mathbb{R}$. The distribution function $\mathcal{P}$ defined in (2) is

$$\mathcal{P}(x) = \begin{cases} 0, & x \in [0, 10^{-6}) \\ 10^{-3}, & x \in [10^{-6}, 10^{-3}) \\ 1, & x \ge 10^{-3} \end{cases} \tag{20}$$

for any nonnegative real number x.

We compute the statistics U, V, and W defined in (3), (4), and (5) for two sets of i.i.d.
draws, first for i.i.d. draws distributed according to p defined in (18), and then for i.i.d. draws
from the uniform distribution on (0, 1999). Table 3 displays numerical results.

Table 3: A step function

n      U_0     U_1               V_0      V_1                 W_0     W_1
-----  ------  ----------------  -------  ------------------  ------  ------
10^1   .11E1   .12E1             .32E-2   .13E1               .10E2   .01E0
10^2   .11E1   .18E1             .10E-1   .46E1 (10^-16)      .10E3   .10E0
10^3   .10E1   .81E0             .32E-1   .16E2 (10^-2E2)     .10E1   .10E1
10^4   .15E1   .17E1             .10E-1   .50E2 (10^-3E3)     .10E2   .10E2
10^5   .11E1   .12E1             .22E-1   .16E3 (10^-3E4)     .10E3   .10E3
10^6   .70E0   .15E1             .19E-1   .50E3 (10^-3E5)     .10E4   .10E4
10^7   .65E0   .33E1 (10^-8)     .12E-1   .16E4 (10^-…)       .10E5   .10E5



For this example, the classical Kuiper statistic U is unable to signal that the draws from
the uniform distribution do not arise from p defined in (18) for $n \le 10^6$, at least not nearly as
well as the modified Kuiper statistic V, which signals the discrepancy with high confidence
for $n \ge 10^2$. The statistic W does not signal the discrepancy for this example.
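
A short calculation, offered as one plausible reading of the $W_1$ column in Table 3 rather than
as a statement from the text, suggests why W carries little information here for large n. It uses
only (5), (18), and (20). A uniform draw X on (0, 1999) lands in one of the low-density intervals
(2k, 2k + 1) with probability 1000/1999, and there $p(X) = 10^{-6}$, so that

$$\mathcal{P}(p(X)) = \mathcal{P}(10^{-6}) = 10^{-3}.$$

As soon as at least one of the n uniform draws lands in such an interval, (5) gives

$$W_1 = n \min_{1 \le k \le n} \mathcal{P}(p(X_k)) = 10^{-3} \, n,$$

which is at least 1 for $n \ge 10^3$, where the bound (6) no longer yields any confidence. A draw
taken from p itself lands in a low-density interval with probability $1000 \cdot 10^{-6} = 10^{-3}$, and
whenever that happens $W_0$ takes the same value $10^{-3} \, n$.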



4.3 Another step function 

The probability density function p for our third example is a step function (a function 
which is constant on each interval in a particular partition of the real line into finitely many 
intervals). In particular, we define 

$$p(x) = \begin{cases} 1/10, & x \in (2k, 2k+1) \text{ for some } k \in \{0, 1, \dots, 8, 9\} \\ 0, & \text{otherwise} \end{cases} \tag{21}$$

for any $x \in \mathbb{R}$. The corresponding cumulative distribution function P defined in (1) is

$$P(x) = \begin{cases} (x - k)/10, & x \in [2k, 2k+1] \text{ for some } k \in \{0, 1, \dots, 8, 9\} \\ (k + 1)/10, & x \in [2k+1, 2k+2] \text{ for some } k \in \{0, 1, \dots, 8, 9\} \\ 0, & x \le 0 \\ 1, & x \ge 19 \end{cases} \tag{22}$$

for any $x \in \mathbb{R}$. The distribution function $\mathcal{P}$ defined in (2) is

$$\mathcal{P}(x) = \begin{cases} 0, & x \in [0, 1/10) \\ 1, & x \ge 1/10 \end{cases} \tag{23}$$

for any nonnegative real number x.

We compute the statistics U, V, and W defined in (3), (4), and (5) for two sets of i.i.d.
draws, first for i.i.d. draws distributed according to p defined in (21), and then for i.i.d. draws
from the uniform distribution on (0, 19). Table 4 displays numerical results.

For this example, the classical Kuiper statistic U signals that the draws from the uniform
distribution do not arise from p defined in (21) for $n \ge 10^3$, but not nearly as well as the
modified Kuiper statistic V, which signals the discrepancy with high confidence for $n \ge 10^2$.
For this experiment, the statistic W signals the discrepancy with perfect 100% confidence
for all numbers n in the table.

Table 4: Another step function

n      U_0     U_1                 V_0     V_1                 W_0     W_1
-----  ------  ------------------  ------  ------------------  ------  ------
10^1   .14E1   .11E1               .00E0   .19E1               .10E2   .00E0
10^2   .10E1   .19E1               .00E0   .51E1 (10^-…)       .10E3   .00E0
10^3   .10E1   .29E1 (10^-…)       .00E0   .15E2 (10^-…)       .10E4   .00E0
10^4   .14E1   .95E1 (10^-76)      .00E0   .46E2 (10^-…)       .10E5   .00E0
10^5   .14E1   .31E2 (10^-1E3)     .00E0   .15E3 (10^-2E4)     .10E6   .00E0
10^6   .81E0   .95E2 (10^-1E4)     .00E0   .47E3 (10^-2E5)     .10E7   .00E0
10^7   .11E1   .30E3 (10^-1E5)     .00E0   .15E4 (10^-2E6)     .10E8   .00E0



4.4 A bimodal distribution 



The probability density function p for our fourth example is

$$p(x) = \begin{cases} x/10100, & x \in [0, 100] \\ (101 - x)/101, & x \in [100, 101] \\ (x - 101)/101, & x \in [101, 102] \\ (202 - x)/10100, & x \in [102, 202] \\ 0, & \text{otherwise} \end{cases} \tag{24}$$

for any $x \in \mathbb{R}$. Figure 1 plots p. The corresponding cumulative distribution function P
defined in (1) is

$$P(x) = \begin{cases} x^2/20200, & x \in [0, 100] \\ (-10100 + 202x - x^2)/202, & x \in [100, 101] \\ (10302 - 202x + x^2)/202, & x \in [101, 102] \\ (-20604 + 404x - x^2)/20200, & x \in [102, 202] \\ 0, & x \le 0 \\ 1, & x \ge 202 \end{cases} \tag{25}$$

for any $x \in \mathbb{R}$. The distribution function $\mathcal{P}$ defined in (2) is

$$\mathcal{P}(x) = \begin{cases} (101x)^2, & x \in [0, 1/101] \\ 1, & x > 1/101 \end{cases} \tag{26}$$

for any nonnegative real number x.

We compute the statistics U, V, and W defined in (3), (4), and (5) for two sets of i.i.d.
draws, first for i.i.d. draws distributed according to p defined in (24), and then for i.i.d. draws
distributed according to the probability density function q defined via the formula

$$q(x) = \begin{cases} x/101^2, & x \in [0, 101] \\ (202 - x)/101^2, & x \in [101, 202] \\ 0, & \text{otherwise} \end{cases} \tag{27}$$

for any $x \in \mathbb{R}$. Figure 2 plots q. Table 5 displays numerical results.

For this example, the classical Kuiper statistic U signals that the draws from q defined
in (27) do not arise from p defined in (24) for $n \ge 10^5$, and the modified Kuiper statistic V
is inferior. The statistic W signals the discrepancy with high confidence for $n \ge 10^4$.
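
The role of the narrow notch can be seen by evaluating the formulas directly. The Python sketch
below transcribes (24), (26), and (27) under the assumption that they read as the piecewise
definitions suggest; the function names are illustrative. It prints, for a few points x, the
densities p(x) and q(x) together with the value of the distribution function from (2) at p(x).

```python
def p(x):
    """Bimodal density (24): a tent with a narrow notch at x = 101."""
    if 0.0 <= x <= 100.0:
        return x / 10100.0
    if 100.0 < x <= 101.0:
        return (101.0 - x) / 101.0
    if 101.0 < x <= 102.0:
        return (x - 101.0) / 101.0
    if 102.0 < x <= 202.0:
        return (202.0 - x) / 10100.0
    return 0.0

def q(x):
    """Unimodal tent density (27), with the notch filled in."""
    if 0.0 <= x <= 101.0:
        return x / 101.0 ** 2
    if 101.0 < x <= 202.0:
        return (202.0 - x) / 101.0 ** 2
    return 0.0

def script_p(y):
    """Distribution function (26) of p(X), for nonnegative y."""
    return min(1.0, (101.0 * y) ** 2)

for x in (50.0, 100.0, 100.9, 101.0):
    print(x, p(x), q(x), script_p(p(x)))
```

Near x = 101 the density q is close to its maximum, while the distribution function from (2)
evaluated at p(x) is close to 0, so draws from q tend to produce small values of W once n is
large enough for some draw to fall inside the notch.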

Table 5: A bimodal distribution

n      U_0     U_1                 V_0     V_1                 W_0     W_1
-----  ------  ------------------  ------  ------------------  ------  -------
10^1   .11E1   .14E1               .11E1   .14E1               .11E0   .98E-0
10^2   .15E1   .15E1               .11E1   .12E1               .37E0   .19E-0
10^3   .11E1   .10E1               .10E1   .13E1               .21E1   .70E-1
10^4   .12E1   .19E1               .15E1   .11E1               .70E0   .68E-3
10^5   .10E1   .33E1 (10^-…)       .11E1   .18E1               .88E0   .40E-3
10^6   .65E0   .99E1 (10^-82)      .68E0   .57E1 (10^-25)      .14E0   .25E-7
10^7   .89E0   .31E2 (10^-1E3)     .66E0   .16E2 (10^-2E2)     .29E0   .25E-6



Figure 1: The bimodal probability density function defined in (24)

Figure 2: The unimodal probability density function defined in (27)

Table 6: A differentiable density function

n      U_0     U_1                 V_0     V_1                 W_0     W_1
-----  ------  ------------------  ------  ------------------  ------  -------
10^1   .12E1   .74E0               .11E1   .11E1               .14E1   .11E-2
10^2   .14E1   .11E1               .18E1   .30E1 (10^-5)       .13E1   .17E-3
10^3   .15E1   .14E1               .92E0   .57E1 (10^-26)      .51E0   .22E-4
10^4   .86E0   .22E1 (10^-3)       .12E1   .16E2 (10^-2E2)     .91E0   .12E-5
10^5   .12E1   .58E1 (10^-27)      .12E1   .52E2 (10^-3E3)     .72E0   .12E-6



4.5 A differentiable density function 

The probability density function p for our fifth example is 

$$ p(x) = \begin{cases} C e^{-|x|} \left( 2 + \cos(13\pi x) + \cos(39\pi x) \right), & x \in [-1, 1] \\ 0, & \text{otherwise} \end{cases} \tag{28} $$

for any $x \in \mathbf{R}$, where $C \approx .4$ is the positive real number chosen such that 
$\int_{-\infty}^{\infty} p(x)\,dx = 1$. Figure 3 plots p. We evaluated numerically the corresponding 
cumulative distribution function P defined in (1), using the Chebfun package for Matlab 
described in [17]. Figure 4 plots P. We evaluated the distribution function V defined in (2) 
using the scheme described in the appendix below (which is also based on Chebfun). 
Figure 5 plots V. 
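The normalization constant and the cumulative distribution function for (28) can be checked independently of Chebfun. The sketch below substitutes a composite Simpson rule for Chebfun, using only the Python standard library; the names `g`, `simpson`, `C`, `p`, and `P` are hypothetical, and the number of quadrature panels is an arbitrary choice.

```python
import math

def g(x):
    """Unnormalized density from (28); zero outside [-1, 1]."""
    if -1.0 <= x <= 1.0:
        return math.exp(-abs(x)) * (
            2 + math.cos(13 * math.pi * x) + math.cos(39 * math.pi * x))
    return 0.0

def simpson(f, a, b, m=4000):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# exp(-|x|) has a kink at 0, so the two halves are integrated separately.
C = 1.0 / (simpson(g, -1.0, 0.0) + simpson(g, 0.0, 1.0))

def p(x):
    """Normalized density (28)."""
    return C * g(x)

def P(x):
    """Cumulative distribution function of p, by quadrature."""
    if x <= -1.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    if x <= 0.0:
        return simpson(p, -1.0, x)
    return simpson(p, -1.0, 0.0) + simpson(p, 0.0, x)
```

With these definitions, `C` comes out close to the value .4 quoted above, and `P(0.0)` is close to 1/2, as the symmetry of p about 0 requires.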

We compute the statistics U, V, and W defined in (3), (4), and (5) for two sets of i.i.d. 
draws, first for i.i.d. draws distributed according to p defined in (28), and then for i.i.d. draws 
distributed according to the probability density function q defined via the formula 

$$ q(x) = \begin{cases} e^{-|x|} / (2 - 2e^{-1}), & x \in [-1, 1] \\ 0, & \text{otherwise} \end{cases} \tag{29} $$

for any $x \in \mathbf{R}$. Table 6 displays numerical results. 
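The text does not say how the two sets of draws were generated. One simple possibility, sketched below under that caveat, is to draw from q in (29) by inverse-transform sampling and from p in (28) by rejection sampling with q as the envelope; this works because p is a constant multiple of q times the factor 2 + cos(13πx) + cos(39πx), which lies between 0 and 4. The names `draw_q` and `draw_p` are hypothetical.

```python
import math
import random

def draw_q(rng):
    """One draw from q in (29): invert the CDF of |x|, then pick a random sign."""
    u = rng.random()
    t = -math.log(1.0 - u * (1.0 - math.exp(-1.0)))  # |x|, lies in [0, 1)
    return t if rng.random() < 0.5 else -t

def draw_p(rng):
    """One draw from p in (28) by rejection sampling from q.

    A proposal x drawn from q is accepted with probability
    (2 + cos(13 pi x) + cos(39 pi x)) / 4, which lies in [0, 1].
    """
    while True:
        x = draw_q(rng)
        w = 2 + math.cos(13 * math.pi * x) + math.cos(39 * math.pi * x)
        if 4 * rng.random() < w:
            return x
```

For example, `[draw_p(rng) for _ in range(n)]` with `rng = random.Random(seed)` yields n draws from p; roughly half of the proposals are accepted, since the two cosine terms nearly average out under q.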

For this example, the classical Kuiper statistic U signals that the draws from q defined 
in (29) do not arise from p defined in (28) for $n \ge 10^4$, but not nearly as well as the modified 
Kuiper statistic V, which signals the discrepancy with high confidence for $n \ge 10^2$. The 
statistic W signals the discrepancy with high confidence for $n \ge 10^2$, too. 

Remark 4.1. For all numerical examples reported above, at least one of the modified Kuiper 
statistic V or the "new" statistic W is more powerful than the classical Kuiper statistic U, 
usually strikingly so. However, we recommend using all three statistics in conjunction, to be 
conservative. In fact, the statistics V and W of the present article are not able to discern 
certain characteristics of probability distributions that U can, such as the symmetry of a 
Gaussian. The classical Kuiper statistic U should be more powerful than its modification 
V for any differentiable probability density function that has only one local maximum. For 
a differentiable probability density function that has only one local maximum, the "new" 
statistic W amounts to an obvious test for outliers — nothing new (and far more subtle 
procedures for identifying outliers are available; see, for example, [12] and [1]). Still, as the 
above examples illustrate, V and W can be helpful with probability density functions that 
have multiple local maxima. 

Figure 5: The distribution function V defined in (2) for (28) 

5 Conclusions and generalizations 

In this paper, we complemented the classical tests of the Kolmogorov-Smirnov type with 
tests based on the plain fact that it is unlikely to draw a random number whose probability 
is small, provided that the draw is taken from the same distribution used in calculating the 
probability (thus, if we draw a random number whose probability is small, then we can be 
confident that we did not draw the number from the same distribution used in calculating 
the probability). The numerical examples of Section 4 illustrate the substantial power of the 
supplementary tests, relative to the classical tests. 

Needless to say, the method of the present paper generalizes straightforwardly to proba- 
bility density functions of several variables. There are also generalizations to discrete distri- 
butions, whose cumulative distribution functions are discontinuous. 

If the probability density function p involved in the definition of the modified Kuiper test 
statistic V in (4) takes on only finitely many values, then the confidence bounds of [15], [16], 
and Sections 14.3.3 and 14.3.4 of [9] are conservative, yielding lower than possible confidence 
levels that i.i.d. draws $X_1, X_2, \dots, X_{n-1}, X_n$ do not arise from p. It is probably feasible 
to compute the tightest possible confidence levels (maybe without resorting to the obvious 
Monte Carlo method), though we may want to replace V with a better statistic when p takes 
on only finitely many values; for example, when p takes on only finitely many values, we 
can literally and explicitly rearrange p to be nondecreasing on the shortest interval outside 
which it vanishes, and use the Kolmogorov-Smirnov approach on the rearranged p. 
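One possible reading of the rearrangement suggested above, for a piecewise-constant p, is sketched here in Python. Everything in it is an assumption made for illustration: the representation of p by bin edges and heights, the helper names (`rearrange`, `transform`, `rearranged_cdf`, `ks_statistic`, `draw`), the example density, the arbitrary ordering of bins with equal heights, and the scaling of the Kolmogorov-Smirnov statistic by the square root of n.

```python
import bisect
import math
import random

def rearrange(edges, heights):
    """Order the bins by height; offsets[i] is where bin i starts afterwards."""
    widths = [edges[i + 1] - edges[i] for i in range(len(heights))]
    order = sorted(range(len(heights)), key=lambda i: heights[i])
    offsets = [0.0] * len(heights)
    pos = 0.0
    for i in order:
        offsets[i] = pos
        pos += widths[i]
    return order, offsets

def transform(x, edges, offsets):
    """Map a draw x to its position under the rearranged density."""
    i = min(max(bisect.bisect_right(edges, x) - 1, 0), len(offsets) - 1)
    return offsets[i] + (x - edges[i])

def rearranged_cdf(t, edges, heights, offsets):
    """Cumulative distribution function of the rearranged density."""
    total = 0.0
    for i, h in enumerate(heights):
        width = edges[i + 1] - edges[i]
        total += h * min(max(t - offsets[i], 0.0), width)
    return total

def ks_statistic(values, cdf):
    """sqrt(n) times the Kolmogorov-Smirnov distance (scaling assumed)."""
    xs = sorted(values)
    n = len(xs)
    d = max(max((k + 1) / n - cdf(x), cdf(x) - k / n) for k, x in enumerate(xs))
    return math.sqrt(n) * d

# Hypothetical example: three unit-width bins with heights summing to 1.
EDGES = [0.0, 1.0, 2.0, 3.0]
HEIGHTS = [0.5, 0.1, 0.4]
ORDER, OFFSETS = rearrange(EDGES, HEIGHTS)

def draw(rng):
    """One draw from the example density: pick a bin, then a uniform point in it."""
    weights = [h * (EDGES[k + 1] - EDGES[k]) for k, h in enumerate(HEIGHTS)]
    i = rng.choices(range(len(HEIGHTS)), weights=weights)[0]
    return rng.uniform(EDGES[i], EDGES[i + 1])
```

In this sketch the draws are mapped through `transform` before `ks_statistic` is applied with `rearranged_cdf`, so that both the draws and the density are rearranged in the same way.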

Even so, the confidence bounds of [15], [16], and Sections 14.3.3 and 14.3.4 of [9] for the 
modified Kuiper test statistic V in (4) are sharp for many probability density functions p. For 
example, the bounds are sharp if, for every nonnegative real number y, the probability is 0 
that p(X) = y, where X is a random variable distributed according to p. This covers many 
cases of practical interest. In general, the tests of the present article are fully usable in their 
current forms, but may not yet be optimal for certain classes of probability distributions. 

Appendix 

In this appendix, we describe numerical methods for constructing the distribution function 
V defined in (2). We would be surprised if our methods turn out to be ideal in any regard, 
but they seem to be adequate for our purposes, and can leverage others' software packages 
to ease the implementation. 

The basis of our implementation is the Chebfun package for Matlab, described in [17]. In 
addition to its other capabilities, Chebfun provides tools for the representation of piecewise- 
smooth real-valued functions on bounded intervals of the real line via Chebyshev series on 
adaptively chosen subintervals of the domain. Chebfun can transform such representations 
in myriad ways, including forming their derivatives and indefinite integrals. Furthermore, 
Chebfun can calculate many interesting characteristics of functions represented in this way, 
including local and global extrema. 

Suppose that p is a piecewise-smooth probability density function that has only finitely 
many local extrema on the shortest interval outside which p vanishes. Then, to compute the 
distribution function V defined in (2) for a representation in Chebfun of p, we perform the 
following four steps: 

1. Locate the local extrema of p on its computational domain (its computational domain 
being the shortest closed interval outside which p vanishes). 

2. Partition the computational domain of p into disjoint subintervals whose endpoints 
are the local extrema of p; on each such subinterval, p is either nondecreasing or 
nonincreasing. 

3. On each subinterval from Step 2, form the indefinite integral of p, using the subinterval's 
endpoint at which p is smaller for the lower limit of integration. 

4. Compute a representation in Chebfun of the function V(x) given by summing up 
the absolute values of the indefinite integrals from Step 3, evaluating the indefinite 
integrals at the points y where p(y) = x; if x is greater than the greatest value of p 
on a subinterval from Step 2, then add in the greatest absolute value of the indefinite 
integral on the subinterval, while if x is less than the least value of p on a subinterval 
from Step 2, then add in the least absolute value of the indefinite integral (namely 0). 
On each subinterval from Step 2 for which there exists a point y such that p(y) = x, 
compute the point y via bisection, trying the Newton method after 10 bisections (and 
reverting to bisection if the Newton method fails to produce accuracy of a digit less 
than the machine precision after 5 Newton steps). 

(Step 4 describes a procedure for evaluating V(x) at an arbitrary point x. Given this 
procedure, Chebfun automates the construction of a highly accurate representation of V 
that can be evaluated efficiently at arbitrary points.) 
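
The four steps can be sketched without Chebfun as follows. This Python sketch is not the implementation used for the paper: it locates extrema on a uniform grid refined by ternary search, uses a composite Simpson rule in place of Chebfun's indefinite integrals, solves p(y) = x by bisection alone (omitting the Newton acceleration), and may miss extrema that lie on plateaus or are narrower than the grid. All function names and the grid sizes are hypothetical. As we read Steps 1 through 4, V(x) is the probability mass of the set of points y with p(y) ≤ x.

```python
import math

def simpson(f, u, v, k=200):
    """Signed composite Simpson approximation of the integral of f from u to v."""
    h = (v - u) / k
    s = f(u) + f(v)
    for i in range(1, k):
        s += (4 if i % 2 else 2) * f(u + i * h)
    return s * h / 3

def refine(p, lo, hi, sign, iters=100):
    """Ternary search for a local max (sign=+1) or min (sign=-1) of p in [lo, hi]."""
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if sign * p(m1) < sign * p(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def local_extrema(p, a, b, m=2000):
    """Step 1: the endpoints plus interior points where the sampled slope changes sign."""
    xs = [a + (b - a) * i / m for i in range(m + 1)]
    ys = [p(x) for x in xs]
    pts = [a]
    for i in range(1, m):
        left, right = ys[i] - ys[i - 1], ys[i + 1] - ys[i]
        if left * right < 0:
            pts.append(refine(p, xs[i - 1], xs[i + 1], 1 if left > 0 else -1))
    pts.append(b)
    return pts

def make_V(p, a, b):
    """Steps 2-4: return a function that evaluates V(x)."""
    pts = local_extrema(p, a, b)
    pieces = []
    for u, v in zip(pts, pts[1:]):           # Step 2: monotone subintervals
        s, t = (u, v) if p(u) <= p(v) else (v, u)
        # Step 3: integrate starting from the endpoint s at which p is smaller.
        pieces.append((s, t, abs(simpson(p, s, t))))

    def V(x):                                # Step 4
        total = 0.0
        for s, t, mass in pieces:
            if x >= p(t):                    # x at or above the greatest value of p
                total += mass
            elif x >= p(s):                  # solve p(y) = x by bisection
                lo, hi = s, t
                for _ in range(60):
                    mid = 0.5 * (lo + hi)
                    if p(mid) < x:
                        lo = mid
                    else:
                        hi = mid
                total += abs(simpson(p, s, 0.5 * (lo + hi)))
            # otherwise x is below the least value of p here and contributes 0
        return total

    return V

def tri(x):
    """Example density: the triangular q of (27)."""
    if 0.0 <= x <= 101.0:
        return x / 101.0**2
    if 101.0 < x <= 202.0:
        return (202.0 - x) / 101.0**2
    return 0.0
```

For the triangular density (27), the set where q(y) ≤ x consists of the two tails of the triangle, so V(x) = (101 x)^2 for x in [0, 1/101]; `make_V(tri, 0.0, 202.0)(1 / 202)` should therefore be close to 1/4.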